Evaluation of Finite State Morphological Analyzers Based on Paradigm Extraction from Wiktionary
Abstract
Wiktionary provides lexical information for an increasing number of languages, including morphological inflection tables, which makes it a good resource for automatically learning rule-based analyses of a language's inflectional morphology. This paper performs an extensive evaluation of a method for extracting generalized paradigms from morphological inflection tables; these paradigms can be converted to weighted and unweighted finite-state transducers for morphological parsing and generation. The inflection tables of 55 languages from the English edition of Wiktionary are converted to such general paradigms, and the performance of the probabilistic parsers based on these paradigms is tested.
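As a rough illustration of what generalizing an inflection table into a paradigm can look like, the sketch below treats the longest substring shared by all forms as a single stem variable and keeps the remaining material as fixed affixes. This is a deliberate simplification and an assumption of this example: the abstract does not spell out the extraction procedure, and the function names (`longest_common_substring`, `extract_paradigm`, `instantiate`), the tag labels, and the toy English table are all hypothetical.

```python
from typing import Dict, List, Tuple

# A paradigm maps each morphological tag to a (prefix, suffix) pair
# wrapped around one stem variable. Simplified, hypothetical representation.
Paradigm = Dict[str, Tuple[str, str]]


def longest_common_substring(forms: List[str]) -> str:
    """Return the longest substring that occurs in every form ('' if none)."""
    shortest = min(forms, key=len)
    # Try candidate lengths from longest to shortest, left to right.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in form for form in forms):
                return candidate
    return ""


def extract_paradigm(table: Dict[str, str]) -> Paradigm:
    """Abstract one inflection table into affix patterns around a shared stem."""
    stem = longest_common_substring(list(table.values()))
    paradigm: Paradigm = {}
    for tag, form in table.items():
        # partition splits at the first occurrence of the stem;
        # an empty stem is handled separately to avoid a ValueError.
        before, _, after = form.partition(stem) if stem else ("", "", form)
        paradigm[tag] = (before, after)
    return paradigm


def instantiate(paradigm: Paradigm, stem: str) -> Dict[str, str]:
    """Generate a full inflection table by plugging a new stem into a paradigm."""
    return {tag: prefix + stem + suffix for tag, (prefix, suffix) in paradigm.items()}


# Toy example table (illustrative data, not taken from the paper).
walk = {"inf": "walk", "3sg": "walks", "past": "walked", "prog": "walking"}
verb_paradigm = extract_paradigm(walk)
talk_forms = instantiate(verb_paradigm, "talk")
```

Here `verb_paradigm` records only the affixes around the stem, so a previously unseen lemma such as `talk` can be inflected by reusing it. A transducer-based analyzer would go further and compile such patterns into a finite-state machine, which this sketch does not attempt.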
Similar resources
Deriving Morphological Analyzers from Example Inflections
This paper presents a semi-automatic method to derive morphological analyzers from a limited number of example inflections suitable for languages with alphabetic writing systems. The system we present learns the inflectional behavior of morphological paradigms from examples and converts the learned paradigms into a finite-state transducer that is able to map inflected forms of previously unseen...
Zmorge: A German Morphological Lexicon Extracted from Wiktionary
We describe a method to automatically extract a German lexicon from Wiktionary that is compatible with the finite-state morphological grammar SMOR. The main advantage of the resulting lexicon over existing lexica for SMOR is that it is open and permissively licensed. A recall-oriented evaluation shows that a morphological analyser built with our lexicon has comparable coverage compared to exist...
A Language-Independent Feature Schema for Inflectional Morphology
This paper presents a universal morphological feature schema that represents the finest distinctions in meaning that are expressed by overt, affixal inflectional morphology across languages. This schema is used to universalize data extracted from Wiktionary via a robust multidimensional table parsing algorithm and feature mapping algorithms, yielding 883,965 instantiated paradigms in 352 langua...
Automated Paradigm Selection for FSA based Konkani Verb Morphological Analyzer
A Morphological Analyzer is a crucial tool for any language. In popular tools used to build morphological analyzers like XFST, HFST and Apertium’s lttoolbox, the finite state approach is used to sequence input characters. We have used the finite state approach to sequence morphemes instead of characters. In this paper we present the architecture and implementation details of a Corpus assisted F...
A Paradigm-Based Finite State Morphological Analyzer for Marathi
A morphological analyzer forms the foundation for many NLP applications of Indian Languages. In this paper, we propose and evaluate the morphological analyzer for Marathi, an inflectional language. The morphological analyzer exploits the efficiency and flexibility offered by finite state machines in modeling the morphotactics while using the well devised system of paradigms to handle the stem a...
Publication date: 2017